Overview

Dataset statistics

Number of variables: 35
Number of observations: 9694
Missing cells: 27630
Missing cells (%): 8.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.2 MiB
Average record size in memory: 238.0 B

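The percentage and memory figures in the dataset statistics follow from the raw counts. The sketch below redoes that arithmetic, assuming that "Missing cells (%)" means missing cells divided by rows times columns and that total memory is roughly the average record size times the row count. The variable names are illustrative and do not come from the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 35
n_observations = 9694
missing_cells = 27630
avg_record_bytes = 238.0

# Total number of cells = rows x columns.
total_cells = n_variables * n_observations

# Share of cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Approximate total memory in MiB, from the average record size.
total_mib = avg_record_bytes * n_observations / 2**20

print(f"total cells:   {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"total size:    {total_mib:.1f} MiB")
```

Both rounded results agree with the reported 8.1% and 2.2 MiB.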
Variable types

Numeric: 14
Categorical: 14
Boolean: 7

Alerts

insulin is highly correlated with diabetesMed (High correlation)
change is highly correlated with diabetesMed (High correlation)
diabetesMed is highly correlated with insulin and 1 other field (High correlation)
admission_source_code is highly correlated with admission_type_code (High correlation)
admission_type_code is highly correlated with admission_source_code (High correlation)
gender is highly correlated with hemoglobin_level (High correlation)
age is highly correlated with medical_specialty (High correlation)
admission_type_code is highly correlated with admission_source_code and 1 other field (High correlation)
medical_specialty is highly correlated with age and 2 other fields (High correlation)
diag_1 is highly correlated with medical_specialty and 2 other fields (High correlation)
diag_2 is highly correlated with diag_1 and 1 other field (High correlation)
diag_3 is highly correlated with diag_1 and 2 other fields (High correlation)
number_diagnoses is highly correlated with diag_3 (High correlation)
hemoglobin_level is highly correlated with gender (High correlation)
insulin is highly correlated with change and 1 other field (High correlation)
change is highly correlated with insulin and 1 other field (High correlation)
age has 293 (3.0%) missing values (Missing)
weight has 9389 (96.9%) missing values (Missing)
num_lab_procedures has 184 (1.9%) missing values (Missing)
num_medications has 308 (3.2%) missing values (Missing)
max_glu_serum has 9186 (94.8%) missing values (Missing)
A1Cresult has 8070 (83.2%) missing values (Missing)
readmitted has 200 (2.1%) missing values (Missing)
number_emergency is highly skewed (γ1 = 32.69060597) (Skewed)
Unnamed: 0 is uniformly distributed (Uniform)
Unnamed: 0 has unique values (Unique)
admission_id has unique values (Unique)
num_procedures has 4385 (45.2%) zeros (Zeros)
number_outpatient has 8093 (83.5%) zeros (Zeros)
number_emergency has 8613 (88.8%) zeros (Zeros)
number_inpatient has 6490 (66.9%) zeros (Zeros)

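The "Missing" alerts can be cross-checked against the overview. The sketch below sums the per-variable missing counts and recomputes each share of the 9694 observations. It assumes the alerts list every variable that has missing values; the dictionary name is illustrative.

```python
# Missing-value counts copied from the "Missing" alerts.
missing_by_column = {
    "age": 293,
    "weight": 9389,
    "num_lab_procedures": 184,
    "num_medications": 308,
    "max_glu_serum": 9186,
    "A1Cresult": 8070,
    "readmitted": 200,
}
n_observations = 9694

# Total missing cells across the alerted columns.
total_missing = sum(missing_by_column.values())

# Per-column share of missing rows, in percent, rounded to one decimal.
shares = {
    name: round(100 * count / n_observations, 1)
    for name, count in missing_by_column.items()
}

print(total_missing)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The total equals the 27630 missing cells in the overview, which is consistent with these seven variables accounting for all missing data.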
Reproduction

Analysis started: 2022-03-04 16:47:09.902006
Analysis finished: 2022-03-04 16:47:49.112894
Duration: 39.21 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

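The reported duration is the difference between the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2022-03-04 16:47:09.902006")
finished = datetime.fromisoformat("2022-03-04 16:47:49.112894")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()

print(f"{duration_seconds:.2f} seconds")  # prints "39.21 seconds"
```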
Variables

Unnamed: 0
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 9694
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4846.5
Minimum: 0
Maximum: 9693
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 75.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 484.65
Q1: 2423.25
median: 4846.5
Q3: 7269.75
95-th percentile: 9208.35
Maximum: 9693
Range: 9693
Interquartile range (IQR): 4846.5

Descriptive statistics

Standard deviation: 2798.561089
Coefficient of variation (CV): 0.5774396139
Kurtosis: -1.2
Mean: 4846.5
Median Absolute Deviation (MAD): 2423.5
Skewness: 0
Sum: 46981971
Variance: 7831944.167
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)

Common Values

Value                 Count  Frequency (%)
0                     1      < 0.1%
4735                  1      < 0.1%
2692                  1      < 0.1%
645                   1      < 0.1%
6790                  1      < 0.1%
4743                  1      < 0.1%
8841                  1      < 0.1%
2700                  1      < 0.1%
653                   1      < 0.1%
6798                  1      < 0.1%
Other values (9684)   9684   99.9%

Smallest values (ascending)

Value  Count  Frequency (%)
0      1      < 0.1%
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%

Largest values (descending)

Value  Count  Frequency (%)
9693   1      < 0.1%
9692   1      < 0.1%
9691   1      < 0.1%
9690   1      < 0.1%
9689   1      < 0.1%
9688   1      < 0.1%
9687   1      < 0.1%
9686   1      < 0.1%
9685   1      < 0.1%
9684   1      < 0.1%

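Unnamed: 0 is a strictly increasing index from 0 to 9693, so its descriptive statistics can be reproduced without the data. The sketch below does this with the standard library, assuming the report uses the sample variance (dividing by n - 1), which is the pandas default.

```python
import statistics

# The column holds the integers 0, 1, ..., 9693 exactly once each.
values = range(9694)

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divides by n - 1
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
median = statistics.median(values)

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(mean, median, mad)
print(round(variance, 3), round(std, 6), round(cv, 10))
```

For n consecutive integers the sample variance is n(n + 1)/12, which for n = 9694 gives the reported 7831944.167.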
admission_id
Real number (ℝ≥0)

UNIQUE

Distinct: 9694
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 91396.53858
Minimum: 81412
Maximum: 101440
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 75.9 KiB

Quantile statistics

Minimum: 81412
5-th percentile: 82456.95
Q1: 86442.25
median: 91364
Q3: 96385.25
95-th percentile: 100380.35
Maximum: 101440
Range: 20028
Interquartile range (IQR): 9943

Descriptive statistics

Standard deviation: 5745.128963
Coefficient of variation (CV): 0.0628593714
Kurtosis: -1.193941082
Mean: 91396.53858
Median Absolute Deviation (MAD): 4974
Skewness: 0.003909738519
Sum: 885998045
Variance: 33006506.8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                 Count  Frequency (%)
92771                 1      < 0.1%
101435                1      < 0.1%
89033                 1      < 0.1%
86885                 1      < 0.1%
98093                 1      < 0.1%
89183                 1      < 0.1%
81667                 1      < 0.1%
89282                 1      < 0.1%
88076                 1      < 0.1%
83824                 1      < 0.1%
Other values (9684)   9684   99.9%

Smallest values (ascending)

Value   Count  Frequency (%)
81412   1      < 0.1%
81415   1      < 0.1%
81417   1      < 0.1%
81418   1      < 0.1%
81419   1      < 0.1%
81420   1      < 0.1%
81426   1      < 0.1%
81427   1      < 0.1%
81429   1      < 0.1%
81431   1      < 0.1%

Largest values (descending)

Value    Count  Frequency (%)
101440   1      < 0.1%
101438   1      < 0.1%
101437   1      < 0.1%
101435   1      < 0.1%
101432   1      < 0.1%
101430   1      < 0.1%
101429   1      < 0.1%
101422   1      < 0.1%
101416   1      < 0.1%
101415   1      < 0.1%

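Several of the admission_id summary values are derived from others in the same block. The sketch below recomputes the range, the interquartile range and the coefficient of variation from the reported figures. Because the inputs are already rounded, the last digits of the coefficient of variation may differ slightly from the report.

```python
# Values copied from the admission_id statistics.
minimum, maximum = 81412, 101440
q1, q3 = 86442.25, 96385.25
mean, std = 91396.53858, 5745.128963

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # Interquartile range = Q3 - Q1
cv = std / mean                   # Coefficient of variation = std / mean

print(value_range, iqr, round(cv, 8))
```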
patient_id
Real number (ℝ≥0)

Distinct: 9219
Distinct (%): 95.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 108440221.2
Minimum: 10368
Maximum: 378731656
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 75.9 KiB

Quantile statistics

Minimum: 10368
5-th percentile: 2786253.3
Q1: 46740537
median: 91137042
Q3: 175249750.5
95-th percentile: 222658891.2
Maximum: 378731656
Range: 378721288
Interquartile range (IQR): 128509213.5

Descriptive statistics

Standard deviation: 77557704.94
Coefficient of variation (CV): 0.7152116076
Kurtosis: -0.3056385962
Mean: 108440221.2
Median Absolute Deviation (MAD): 66517875
Skewness: 0.4817043225
Sum: 1.051219505 × 10^12
Variance: 6.015197596 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                 Count  Frequency (%)
86281740              6      0.1%
3061368               5      0.1%
769806                5      0.1%
85026852              4      < 0.1%
74193660              4      < 0.1%
46796904              4      < 0.1%
177579342             4      < 0.1%
186411312             3      < 0.1%
184554756             3      < 0.1%
99440244              3      < 0.1%
Other values (9209)   9653   99.6%

Smallest values (ascending)

Value   Count  Frequency (%)
10368   1      < 0.1%
13320   1      < 0.1%
13374   1      < 0.1%
16344   1      < 0.1%
22950   1      < 0.1%
24822   1      < 0.1%
26010   1      < 0.1%
29520   1      < 0.1%
37458   1      < 0.1%
44244   1      < 0.1%

Largest values (descending)

Value       Count  Frequency (%)
378731656   1      < 0.1%
378515620   1      < 0.1%
378390772   1      < 0.1%
378338950   1      < 0.1%
377407324   1      < 0.1%
376569694   1      < 0.1%
376238278   1      < 0.1%
375847048   1      < 0.1%
373535938   1      < 0.1%
373514608   1      < 0.1%

race
Categorical

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 75.9 KiB

Length

Max length: 8
Median length: 5
Mean length: 5.107798638
Min length: 5

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: white
2nd row: black
3rd row: black
4th row: white
5th row: hispanic

Common Values

Value      Count  Frequency (%)
white      7244   74.7%
black      1798   18.5%
unknown    227    2.3%
hispanic   197    2.0%
other      168    1.7%
asian      60     0.6%

Length

Histogram of lengths of the category

Pie chart

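The length statistics reported for race can be recomputed from the category counts, since every row contributes the length of its label. A small sketch, with the counts copied from the Common Values table and an illustrative dictionary name:

```python
import statistics

# Category counts for race, copied from the report.
race_counts = {
    "white": 7244,
    "black": 1798,
    "unknown": 227,
    "hispanic": 197,
    "other": 168,
    "asian": 60,
}

n_rows = sum(race_counts.values())

# Total characters = label length weighted by how often the label occurs.
total_chars = sum(len(label) * count for label, count in race_counts.items())
mean_length = total_chars / n_rows

# Expand to one length per row to take the median.
lengths = [len(label) for label, count in race_counts.items() for _ in range(count)]
median_length = statistics.median(lengths)

print(n_rows, total_chars, round(mean_length, 9))
print(min(lengths), median_length, max(lengths))
```

The result agrees with the reported mean length of 5.107798638, and with the minimum, median and maximum lengths of 5, 5 and 8.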
gender
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 75.9 KiB

Length

Max length: 7
Median length: 6
Mean length: 5.081184238
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: male
2nd row: male
3rd row: female
4th row: male
5th row: male

Common Values

Value     Count  Frequency (%)
female    5239   54.0%
male      4454   45.9%
unknown   1      < 0.1%

Length

Histogram of lengths of the category

Pie chart

age
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 10
Distinct (%): 0.1%
Missing: 293
Missing (%): 3.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 60.86161047
Minimum: 0
Maximum: 90
Zeros: 10
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 75.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 30
Q1: 50
median: 60
Q3: 70
95-th percentile: 80
Maximum: 90
Range: 90
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 15.67428086
Coefficient of variation (CV): 0.2575396993
Kurtosis: 0.241595252
Mean: 60.86161047
Median Absolute Deviation (MAD): 10
Skewness: -0.6052773068
Sum: 572160
Variance: 245.6830803
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Common Values

Value       Count  Frequency (%)
70          2471   25.5%
60          2113   21.8%
50          1606   16.6%
80          1498   15.5%
40          914    9.4%
30          336    3.5%
90          245    2.5%
20          150    1.5%
10          58     0.6%
0           10     0.1%
(Missing)   293    3.0%

Smallest values (ascending)

Value  Count  Frequency (%)
0      10     0.1%
10     58     0.6%
20     150    1.5%
30     336    3.5%
40     914    9.4%
50     1606   16.6%
60     2113   21.8%
70     2471   25.5%
80     1498   15.5%
90     245    2.5%

Largest values (descending)

Value  Count  Frequency (%)
90     245    2.5%
80     1498   15.5%
70     2471   25.5%
60     2113   21.8%
50     1606   16.6%
40     914    9.4%
30     336    3.5%
20     150    1.5%
10     58     0.6%
0      10     0.1%

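Because age takes only ten distinct values, its sum and mean can be recomputed from the frequency table alone. The sketch below assumes missing rows are excluded from the mean, as pandas does by default; the names are illustrative.

```python
# Frequency table for age, copied from the report (value -> count).
age_counts = {
    70: 2471, 60: 2113, 50: 1606, 80: 1498, 40: 914,
    30: 336, 90: 245, 20: 150, 10: 58, 0: 10,
}
n_total = 9694
n_missing = 293

# Rows that actually carry an age value.
n_present = sum(age_counts.values())

# Weighted sum and mean over the non-missing rows.
total = sum(value * count for value, count in age_counts.items())
mean_age = total / n_present

print(n_present, total, round(mean_age, 8))
```

The count of non-missing rows (9401) equals 9694 minus the 293 missing values, and the sum and mean agree with the reported 572160 and 60.86161047.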
weight
Real number (ℝ≥0)

MISSING

Distinct: 8
Distinct (%): 2.6%
Missing: 9389
Missing (%): 96.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 76.06557377
Minimum: 0
Maximum: 175
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 75.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 50
Q1: 50
median: 75
Q3: 100
95-th percentile: 125
Maximum: 175
Range: 175
Interquartile range (IQR): 50

Descriptive statistics

Standard deviation: 25.34490298
Coefficient of variation (CV): 0.3331980779
Kurtosis: 0.9805279891
Mean: 76.06557377
Median Absolute Deviation (MAD): 25
Skewness: 0.3528171881
Sum: 23200
Variance: 642.364107
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)

Common Values

Value       Count  Frequency (%)
75          130    1.3%
50          75     0.8%
100         68     0.7%
125         15     0.2%
25          10     0.1%
150         4      < 0.1%
0           2      < 0.1%
175         1      < 0.1%
(Missing)   9389   96.9%

Smallest values (ascending)

Value  Count  Frequency (%)
0      2      < 0.1%
25     10     0.1%
50     75     0.8%
75     130    1.3%
100    68     0.7%
125    15     0.2%
150    4      < 0.1%
175    1      < 0.1%

Largest values (descending)

Value  Count  Frequency (%)
175    1      < 0.1%
150    4      < 0.1%
125    15     0.2%
100    68     0.7%
75     130    1.3%
50     75     0.8%
25     10     0.1%
0      2      < 0.1%

admission_type_code
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 75.9 KiB

Length

Max length: 13
Median length: 9
Mean length: 8.038786878
Min length: 6

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Urgent
2nd row: Urgent
3rd row: Elective
4th row: Elective
5th row: Emergency

Common Values

Value           Count  Frequency (%)
Emergency       5003   51.6%
Elective        1811   18.7%
Urgent          1759   18.1%
unknown         1119   11.5%
Trauma Center   2      < 0.1%

Length

Histogram of lengths of the category

Pie chart

Word frequencies (lowercased)

Value       Count  Frequency (%)
emergency   5003   51.6%
elective    1811   18.7%
urgent      1759   18.1%
unknown     1119   11.5%
trauma      2      < 0.1%
center      2      < 0.1%

Distinct: 20
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 75.9 KiB

Length

Max length: 105
Median length: 18
Mean length: 26.53775531
Min length: 7

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 0.1%

Sample

1st row: Discharged to home
2nd row: unknown
3rd row: Discharged to home
4th row: Discharged to home
5th row: Discharged/transferred to SNF

Common Values

Value                                                                               Count  Frequency (%)
Discharged to home                                                                  5786   59.7%
Discharged/transferred to SNF                                                       1342   13.8%
Discharged/transferred to home with home health service                             1244   12.8%
unknown                                                                             545    5.6%
Discharged/transferred to another short term hospital                               202    2.1%
Discharged/transferred to another rehab fac including rehab units of a hospital     181    1.9%
Discharged/transferred to another type of inpatient care institution                118    1.2%
Discharged/transferred to ICF                                                       70     0.7%
Left AMA                                                                            55     0.6%
Discharged/transferred to a long term care hospital                                 50     0.5%
Other values (10)                                                                   101    1.0%

Length

Histogram of lengths of the category

Word frequencies (lowercased)

Value                    Count  Frequency (%)
to                       9024   25.1%
home                     8332   23.1%
discharged               5786   16.1%
discharged/transferred   3223   8.9%
snf                      1342   3.7%
health                   1245   3.5%
with                     1244   3.5%
service                  1244   3.5%
unknown                  545    1.5%
another                  502    1.4%
Other values (47)        3534   9.8%

admission_source_code
Categorical

HIGH CORRELATION

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 75.9 KiB

Length

Max length: 46
Median length: 14
Mean length: 15.812255
Min length: 7

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Physician Referral
2nd row: Clinic Referral
3rd row: Physician Referral
4th row: Physician Referral
5th row: Emergency Room

Common Values

Value                                            Count  Frequency (%)
Emergency Room                                   5468   56.4%
Physician Referral                               2841   29.3%
unknown                                          679    7.0%
Transfer from a hospital                         302    3.1%
Transfer from another health care facility       192    2.0%
Clinic Referral                                  114    1.2%
Transfer from a Skilled Nursing Facility (SNF)   76     0.8%
HMO Referral                                     19     0.2%
Transfer from critial access hospital            2      < 0.1%
Court/Law Enforcement                            1      < 0.1%

Length

Histogram of lengths of the category

Pie chart

Word frequencies (lowercased)

Value               Count  Frequency (%)
emergency           5468   26.7%
room                5468   26.7%
referral            2974   14.5%
physician           2841   13.9%
unknown             679    3.3%
transfer            572    2.8%
from                572    2.8%
a                   378    1.8%
hospital            304    1.5%
facility            268    1.3%
Other values (12)   943    4.6%

time_in_hospital
Real number (ℝ≥0)

Distinct: 14
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.389725603
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 75.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 4
Q3: 6
95-th percentile: 11
Maximum: 14
Range: 13
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.004512179
Coefficient of variation (CV): 0.6844419107
Kurtosis: 0.8095413805
Mean: 4.389725603
Median Absolute Deviation (MAD): 2
Skewness: 1.130620534
Sum: 42554
Variance: 9.027093435
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)

Common Values

Value              Count  Frequency (%)
3                  1692   17.5%
2                  1656   17.1%
1                  1388   14.3%
4                  1275   13.2%
5                  945    9.7%
6                  720    7.4%
7                  544    5.6%
8                  421    4.3%
9                  293    3.0%
10                 214    2.2%
Other values (4)   546    5.6%

Smallest values (ascending)

Value  Count  Frequency (%)
1      1388   14.3%
2      1656   17.1%
3      1692   17.5%
4      1275   13.2%
5      945    9.7%
6      720    7.4%
7      544    5.6%
8      421    4.3%
9      293    3.0%
10     214    2.2%

Largest values (descending)

Value  Count  Frequency (%)
14     93     1.0%
13     128    1.3%
12     139    1.4%
11     186    1.9%
10     214    2.2%
9      293    3.0%
8      421    4.3%
7      544    5.6%
6      720    7.4%
5      945    9.7%

payer_code
Categorical

Distinct: 17
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 75.9 KiB

Length

Max length: 7
Median length: 2
Mean length: 3.98215391
Min length: 2

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: BC
2nd row: unknown
3rd row: BC
4th row: MC
5th row: unknown

Common Values

Value              Count  Frequency (%)
unknown            3843   39.6%
MC                 3038   31.3%
HM                 617    6.4%
SP                 491    5.1%
BC                 453    4.7%
MD                 332    3.4%
UN                 244    2.5%
CP                 213    2.2%
CM                 195    2.0%
OG                 120    1.2%
Other values (7)   148    1.5%

Length

Histogram of lengths of the category

Word frequencies (lowercased)

Value              Count  Frequency (%)
unknown            3843   39.6%
mc                 3038   31.3%
hm                 617    6.4%
sp                 491    5.1%
bc                 453    4.7%
md                 332    3.4%
un                 244    2.5%
cp                 213    2.2%
cm                 195    2.0%
og                 120    1.2%
Other values (7)   148    1.5%

medical_specialty
Categorical

HIGH CORRELATION

Distinct: 24
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 75.9 KiB

Length

Max length: 22
Median length: 7
Mean length: 10.25562203
Min length: 6

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Endocrinology
2nd row: Others
3rd row: Family/GeneralPractice
4th row: Radiology
5th row: Others

Common Values

Value                    Count  Frequency (%)
Others                   4729   48.8%
InternalMedicine         1355   14.0%
Emergency/Trauma         759    7.8%
Family/GeneralPractice   738    7.6%
Cardiology               507    5.2%
Surgery                  502    5.2%
Orthopedics              271    2.8%
Radiology                133    1.4%
Nephrology               122    1.3%
Psychiatry               104    1.1%
Other values (14)        474    4.9%

Length

Histogram of lengths of the category

Word frequencies (lowercased)

Value                    Count  Frequency (%)
others                   4729   48.8%
internalmedicine         1355   14.0%
emergency/trauma         759    7.8%
family/generalpractice   738    7.6%
cardiology               507    5.2%
surgery                  502    5.2%
orthopedics              271    2.8%
radiology                133    1.4%
nephrology               122    1.3%
psychiatry               104    1.1%
Other values (14)        474    4.9%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.6 KiB

Value   Count  Frequency (%)
False   9599   99.0%
True    95     1.0%

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 75.9 KiB

Length

Max length: 10
Median length: 8
Mean length: 8.304105632
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: complete
2nd row: complete
3rd row: complete
4th row: complete
5th row: complete

Common Values

Value        Count  Frequency (%)
complete     8106   83.6%
incomplete   1550   16.0%
none         38     0.4%

Length

Histogram of lengths of the category

Pie chart

num_lab_procedures
Real number (ℝ≥0)

MISSING

Distinct: 105
Distinct (%): 1.1%
Missing: 184
Missing (%): 1.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 42.94090431
Minimum: 1
Maximum: 111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 75.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 31
median: 44
Q3: 57
95-th percentile: 73
Maximum: 111
Range: 110
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 19.86203542
Coefficient of variation (CV): 0.4625434824
Kurtosis: -0.2834919112
Mean: 42.94090431
Median Absolute Deviation (MAD): 13
Skewness: -0.2405578687
Sum: 408368
Variance: 394.500451
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
1                   320    3.3%
43                  255    2.6%
42                  219    2.3%
49                  218    2.2%
45                  217    2.2%
44                  216    2.2%
39                  206    2.1%
40                  206    2.1%
46                  203    2.1%
37                  198    2.0%
Other values (95)   7252   74.8%

Smallest values (ascending)

Value  Count  Frequency (%)
1      320    3.3%
2      109    1.1%
3      56     0.6%
4      37     0.4%
5      31     0.3%
6      34     0.4%
7      24     0.2%
8      43     0.4%
9      85     0.9%
10     92     0.9%

Largest values (descending)

Value  Count  Frequency (%)
111    1      < 0.1%
109    1      < 0.1%
107    1      < 0.1%
103    1      < 0.1%
102    1      < 0.1%
101    2      < 0.1%
100    3      < 0.1%
98     1      < 0.1%
97     4      < 0.1%
96     1      < 0.1%

num_procedures
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.349391376
Minimum: 0
Maximum: 6
Zeros: 4385
Zeros (%): 45.2%
Negative: 0
Negative (%): 0.0%
Memory size: 75.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 5
Maximum: 6
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.703723337
Coefficient of variation (CV): 1.262586502
Kurtosis: 0.836890554
Mean: 1.349391376
Median Absolute Deviation (MAD): 1
Skewness: 1.307333976
Sum: 13081
Variance: 2.902673208
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Common Values

Value  Count  Frequency (%)
0      4385   45.2%
1      2007   20.7%
2      1234   12.7%
3      914    9.4%
6      461    4.8%
4      367    3.8%
5      326    3.4%

Smallest values (ascending)

Value  Count  Frequency (%)
0      4385   45.2%
1      2007   20.7%
2      1234   12.7%
3      914    9.4%
4      367    3.8%
5      326    3.4%
6      461    4.8%

Largest values (descending)

Value  Count  Frequency (%)
6      461    4.8%
5      326    3.4%
4      367    3.8%
3      914    9.4%
2      1234   12.7%
1      2007   20.7%
0      4385   45.2%

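The quantiles of num_procedures can be recovered from its complete frequency table without expanding it into 9694 rows. The sketch below walks the cumulative counts to find order statistics and assumes linear interpolation between neighbouring order statistics, which is the default in pandas and NumPy; whether the report uses exactly that rule is an assumption. The function names are illustrative.

```python
from bisect import bisect_right
from itertools import accumulate

# Complete frequency table for num_procedures (value -> count).
freq = {0: 4385, 1: 2007, 2: 1234, 3: 914, 4: 367, 5: 326, 6: 461}

values = sorted(freq)
cumulative = list(accumulate(freq[v] for v in values))
n = cumulative[-1]


def order_statistic(k):
    """Return the k-th smallest observation (0-based) from the table."""
    return values[bisect_right(cumulative, k)]


def quantile(p):
    """Quantile with linear interpolation between order statistics."""
    position = p * (n - 1)
    lower = int(position)
    fraction = position - lower
    low_value = order_statistic(lower)
    high_value = order_statistic(min(lower + 1, n - 1))
    return low_value + fraction * (high_value - low_value)


for p in (0.05, 0.25, 0.5, 0.75, 0.95):
    print(p, quantile(p))
```

Under that assumption the five quantiles come out as 0, 0, 1, 2 and 5, the same as the reported 5-th percentile, Q1, median, Q3 and 95-th percentile.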
num_medications
Real number (ℝ≥0)

MISSING

Distinct: 64
Distinct (%): 0.7%
Missing: 308
Missing (%): 3.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.0644577
Minimum: 1
Maximum: 79
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 75.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 10
median: 15
Q3: 20
95-th percentile: 31
Maximum: 79
Range: 78
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 8.277705312
Coefficient of variation (CV): 0.5152807188
Kurtosis: 3.558316821
Mean: 16.0644577
Median Absolute Deviation (MAD): 5
Skewness: 1.382377006
Sum: 150781
Variance: 68.52040523
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
15                  568    5.9%
13                  566    5.8%
12                  557    5.7%
11                  551    5.7%
10                  510    5.3%
14                  502    5.2%
16                  482    5.0%
9                   445    4.6%
17                  445    4.6%
18                  406    4.2%
Other values (54)   4354   44.9%

Smallest values (ascending)

Value  Count  Frequency (%)
1      26     0.3%
2      51     0.5%
3      83     0.9%
4      109    1.1%
5      174    1.8%
6      242    2.5%
7      355    3.7%
8      406    4.2%
9      445    4.6%
10     510    5.3%

Largest values (descending)

Value  Count  Frequency (%)
79     1      < 0.1%
66     1      < 0.1%
65     1      < 0.1%
61     4      < 0.1%
60     3      < 0.1%
59     3      < 0.1%
58     6      0.1%
57     3      < 0.1%
56     4      < 0.1%
55     3      < 0.1%

number_outpatient
Real number (ℝ≥0)

ZEROS

Distinct: 19
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3634206726
Minimum: 0
Maximum: 21
Zeros: 8093
Zeros (%): 83.5%
Negative: 0
Negative (%): 0.0%
Memory size: 75.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 21
Range: 21
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.187414181
Coefficient of variation (CV): 3.267327014
Kurtosis: 70.18884936
Mean: 0.3634206726
Median Absolute Deviation (MAD): 0
Skewness: 6.716536954
Sum: 3523
Variance: 1.409952437
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)

Common Values

Value              Count  Frequency (%)
0                  8093   83.5%
1                  824    8.5%
2                  330    3.4%
3                  212    2.2%
4                  106    1.1%
5                  49     0.5%
6                  30     0.3%
8                  11     0.1%
7                  8      0.1%
11                 5      0.1%
Other values (9)   26     0.3%

Smallest values (ascending)

Value  Count  Frequency (%)
0      8093   83.5%
1      824    8.5%
2      330    3.4%
3      212    2.2%
4      106    1.1%
5      49     0.5%
6      30     0.3%
7      8      0.1%
8      11     0.1%
9      5      0.1%

Largest values (descending)

Value  Count  Frequency (%)
21     2      < 0.1%
19     1      < 0.1%
17     3      < 0.1%
16     1      < 0.1%
15     4      < 0.1%
14     4      < 0.1%
13     2      < 0.1%
11     5      0.1%
10     4      < 0.1%
9      5      0.1%

number_emergency
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 15
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1893955024
Minimum: 0
Maximum: 63
Zeros: 8613
Zeros (%): 88.8%
Negative: 0
Negative (%): 0.0%
Memory size: 75.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 63
Range: 63
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.952079387
Coefficient of variation (CV): 5.026937678
Kurtosis: 1978.568005
Mean: 0.1893955024
Median Absolute Deviation (MAD): 0
Skewness: 32.69060597
Sum: 1836
Variance: 0.9064551592
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)

Common Values

Value              Count  Frequency (%)
0                  8613   88.8%
1                  747    7.7%
2                  195    2.0%
3                  57     0.6%
4                  42     0.4%
5                  8      0.1%
7                  8      0.1%
8                  6      0.1%
6                  6      0.1%
10                 4      < 0.1%
Other values (5)   8      0.1%

Smallest values (ascending)

Value  Count  Frequency (%)
0      8613   88.8%
1      747    7.7%
2      195    2.0%
3      57     0.6%
4      42     0.4%
5      8      0.1%
6      6      0.1%
7      8      0.1%
8      6      0.1%
9      2      < 0.1%

Largest values (descending)

Value  Count  Frequency (%)
63     1      < 0.1%
13     1      < 0.1%
12     2      < 0.1%
11     2      < 0.1%
10     4      < 0.1%
9      2      < 0.1%
8      6      0.1%
7      8      0.1%
6      6      0.1%
5      8      0.1%

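number_emergency has only 15 distinct values, and the value tables in this section together list all of them, so the skewness that triggers the "Skewed" alert can be recomputed from the frequency table. The sketch below assumes the report uses the bias-adjusted Fisher-Pearson coefficient and the sample variance, as pandas does by default; the names are illustrative.

```python
from math import sqrt

# Full frequency table for number_emergency (value -> count),
# assembled from the value tables in this section.
freq = {
    0: 8613, 1: 747, 2: 195, 3: 57, 4: 42, 5: 8, 6: 6, 7: 8,
    8: 6, 9: 2, 10: 4, 11: 2, 12: 2, 13: 1, 63: 1,
}

n = sum(freq.values())
total = sum(value * count for value, count in freq.items())
mean = total / n

# Second and third central moments (dividing by n).
m2 = sum(count * (value - mean) ** 2 for value, count in freq.items()) / n
m3 = sum(count * (value - mean) ** 3 for value, count in freq.items()) / n

# Sample variance (divides by n - 1).
variance = m2 * n / (n - 1)

# Fisher-Pearson coefficient, then the small-sample adjustment.
g1 = m3 / m2 ** 1.5
skewness = g1 * sqrt(n * (n - 1)) / (n - 2)

print(n, total, round(variance, 6), round(skewness, 3))
```

A single observation of 63 dominates the third moment, which is why the skewness is so large even though 88.8% of the values are zero.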
number_inpatient
Real number (ℝ≥0)

ZEROS

Distinct: 17
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6262636682
Minimum: 0
Maximum: 16
Zeros: 6490
Zeros (%): 66.9%
Negative: 0
Negative (%): 0.0%
Memory size: 75.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 3
Maximum: 16
Range: 16
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.254516375
Coefficient of variation (CV): 2.00317604
Kurtosis: 21.20301943
Mean: 0.6262636682
Median Absolute Deviation (MAD): 0
Skewness: 3.647348129
Sum: 6071
Variance: 1.573811335
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)

Common Values

Value              Count  Frequency (%)
0                  6490   66.9%
1                  1837   18.9%
2                  694    7.2%
3                  326    3.4%
4                  156    1.6%
5                  85     0.9%
6                  47     0.5%
8                  17     0.2%
7                  16     0.2%
10                 8      0.1%
Other values (7)   18     0.2%

Smallest values (ascending)

Value  Count  Frequency (%)
0      6490   66.9%
1      1837   18.9%
2      694    7.2%
3      326    3.4%
4      156    1.6%
5      85     0.9%
6      47     0.5%
7      16     0.2%
8      17     0.2%
9      6      0.1%

Largest values (descending)

Value  Count  Frequency (%)
16     1      < 0.1%
15     2      < 0.1%
14     2      < 0.1%
13     1      < 0.1%
12     2      < 0.1%
11     4      < 0.1%
10     8      0.1%
9      6      0.1%
8      17     0.2%
7      16     0.2%

diag_1
Categorical

HIGH CORRELATION

Distinct        18
Distinct (%)    0.2%
Missing         0
Missing (%)     0.0%
Memory size     75.9 KiB

Value                                                                    Count
diseases of the circulatory system                                       2954
endocrine, nutritional and metabolic diseases, and immunity disorders    1054
diseases of the respiratory system                                       953
diseases of the digestive system                                         864
symptoms, signs, and ill-defined conditions                              708
Other values (13)                                                        3161

Length

Max length       69
Median length    34
Mean length      37.9945327
Min length       7

Characters and Unicode

Total characters       0
Distinct characters    0
Distinct categories    0
Distinct scripts       0
Distinct blocks        0

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    diseases of the digestive system
2nd row    diseases of the circulatory system
3rd row    diseases of the digestive system
4th row    diseases of the circulatory system
5th row    injury and poisoning

Common Values

Value                                                                    Count   Frequency (%)
diseases of the circulatory system                                       2954    30.5%
endocrine, nutritional and metabolic diseases, and immunity disorders    1054    10.9%
diseases of the respiratory system                                       953     9.8%
diseases of the digestive system                                         864     8.9%
symptoms, signs, and ill-defined conditions                              708     7.3%
injury and poisoning                                                     648     6.7%
diseases of the genitourinary system                                     500     5.2%
diseases of the musculoskeletal system and connective tissue             489     5.0%
neoplasms                                                                333     3.4%
infectious and parasitic diseases                                        264     2.7%
Other values (8)                                                         927     9.6%

Length

Histogram of lengths of the category

Most frequent words

Value              Count   Frequency (%)
diseases           7563    15.0%
of                 6464    12.8%
the                6297    12.5%
system             5878    11.6%
and                4754    9.4%
circulatory        2954    5.8%
disorders          1272    2.5%
nutritional        1054    2.1%
metabolic          1054    2.1%
immunity           1054    2.1%
Other values (33)  12216   24.2%


diag_2
Categorical

HIGH CORRELATION

Distinct        18
Distinct (%)    0.2%
Missing         0
Missing (%)     0.0%
Memory size     75.9 KiB

Value                                                                    Count
diseases of the circulatory system                                       2960
endocrine, nutritional and metabolic diseases, and immunity disorders    1991
diseases of the respiratory system                                       919
diseases of the genitourinary system                                     754
symptoms, signs, and ill-defined conditions                              434
Other values (13)                                                        2636

Length

Max length       69
Median length    34
Mean length      40.73179286
Min length       7

Characters and Unicode

Total characters       0
Distinct characters    0
Distinct categories    0
Distinct scripts       0
Distinct blocks        0

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    endocrine, nutritional and metabolic diseases, and immunity disorders
2nd row    diseases of the genitourinary system
3rd row    diseases of the genitourinary system
4th row    diseases of the circulatory system
5th row    injury and poisoning

Common Values

Value                                                                    Count   Frequency (%)
diseases of the circulatory system                                       2960    30.5%
endocrine, nutritional and metabolic diseases, and immunity disorders    1991    20.5%
diseases of the respiratory system                                       919     9.5%
diseases of the genitourinary system                                     754     7.8%
symptoms, signs, and ill-defined conditions                              434     4.5%
diseases of the digestive system                                         369     3.8%
diseases of the skin and subcutaneous tissue                             323     3.3%
mental disorders                                                         282     2.9%
diseases of the blood and blood-forming organs                           265     2.7%
external causes of injury                                                238     2.5%
Other values (8)                                                         1159    12.0%

Length

Histogram of lengths of the category

Most frequent words

Value              Count   Frequency (%)
diseases           8034    15.1%
of                 6142    11.6%
the                5904    11.1%
and                5718    10.8%
system             5284    10.0%
circulatory        2960    5.6%
disorders          2273    4.3%
endocrine          1991    3.8%
nutritional        1991    3.8%
metabolic          1991    3.8%
Other values (33)  10753   20.3%


diag_3
Categorical

HIGH CORRELATION

Distinct        18
Distinct (%)    0.2%
Missing         0
Missing (%)     0.0%
Memory size     75.9 KiB

Value                                                                    Count
diseases of the circulatory system                                       2893
endocrine, nutritional and metabolic diseases, and immunity disorders    2587
diseases of the respiratory system                                       584
diseases of the genitourinary system                                     574
external causes of injury                                                496
Other values (13)                                                        2560

Length

Max length       69
Median length    34
Mean length      42.90406437
Min length       7

Characters and Unicode

Total characters       0
Distinct characters    0
Distinct categories    0
Distinct scripts       0
Distinct blocks        0

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    diseases of the circulatory system
2nd row    diseases of the respiratory system
3rd row    diseases of the genitourinary system
4th row    mental disorders
5th row    injury and poisoning

Common Values

Value                                                                    Count   Frequency (%)
diseases of the circulatory system                                       2893    29.8%
endocrine, nutritional and metabolic diseases, and immunity disorders    2587    26.7%
diseases of the respiratory system                                       584     6.0%
diseases of the genitourinary system                                     574     5.9%
external causes of injury                                                496     5.1%
symptoms, signs, and ill-defined conditions                              403     4.2%
diseases of the digestive system                                         317     3.3%
mental disorders                                                         303     3.1%
diseases of the blood and blood-forming organs                           242     2.5%
diseases of the skin and subcutaneous tissue                             210     2.2%
Other values (8)                                                         1085    11.2%

Length

Histogram of lengths of the category

Most frequent words

Value              Count   Frequency (%)
diseases           7942    14.4%
and                6786    12.3%
of                 5708    10.4%
the                5212    9.5%
system             4733    8.6%
circulatory        2893    5.3%
disorders          2890    5.3%
endocrine          2587    4.7%
nutritional        2587    4.7%
metabolic          2587    4.7%
Other values (33)  11088   20.2%


number_diagnoses
Real number (ℝ≥0)

HIGH CORRELATION

Distinct                           14
Distinct (%)                       0.1%
Missing                            0
Missing (%)                        0.0%
Infinite                           0
Infinite (%)                       0.0%
Mean                               7.41252321
Minimum                            1
Maximum                            16
Zeros                              0
Zeros (%)                          0.0%
Negative                           0
Negative (%)                       0.0%
Memory size                        75.9 KiB

Quantile statistics

Minimum                            1
5-th percentile                    4
Q1                                 6
Median                             8
Q3                                 9
95-th percentile                   9
Maximum                            16
Range                              15
Interquartile range (IQR)          3

Descriptive statistics

Standard deviation                 1.954272453
Coefficient of variation (CV)      0.2636446993
Kurtosis                           -0.1242502461
Mean                               7.41252321
Median Absolute Deviation (MAD)    1
Skewness                           -0.9111943497
Sum                                71857
Variance                           3.819180819
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=14)

Most frequent values

Value             Count   Frequency (%)
9                 4746    49.0%
5                 1038    10.7%
8                 1013    10.4%
6                 984     10.2%
7                 935     9.6%
4                 542     5.6%
3                 298     3.1%
2                 109     1.1%
1                 22      0.2%
16                2       < 0.1%
Other values (4)  5       0.1%

Smallest values

Value             Count   Frequency (%)
1                 22      0.2%
2                 109     1.1%
3                 298     3.1%
4                 542     5.6%
5                 1038    10.7%
6                 984     10.2%
7                 935     9.6%
8                 1013    10.4%
9                 4746    49.0%
10                1       < 0.1%

Largest values

Value             Count   Frequency (%)
16                2       < 0.1%
15                2       < 0.1%
14                1       < 0.1%
12                1       < 0.1%
10                1       < 0.1%
9                 4746    49.0%
8                 1013    10.4%
7                 935     9.6%
6                 984     10.2%
5                 1038    10.7%

blood_type
Categorical

Distinct        8
Distinct (%)    0.1%
Missing         0
Missing (%)     0.0%
Memory size     75.9 KiB

Value               Count
O+                  3799
A+                  2968
B+                  1074
O-                  702
A-                  578
Other values (3)    573

Length

Max length       3
Median length    2
Mean length      2.042603672
Min length       2

Characters and Unicode

Total characters       0
Distinct characters    0
Distinct categories    0
Distinct scripts       0
Distinct blocks        0

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    A+
2nd row    O-
3rd row    A+
4th row    O-
5th row    O+

Common Values

Value   Count   Frequency (%)
O+      3799    39.2%
A+      2968    30.6%
B+      1074    11.1%
O-      702     7.2%
A-      578     6.0%
AB+     321     3.3%
B-      160     1.7%
AB-     92      0.9%
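The Frequency (%) column is each count divided by the number of observations. As a small illustrative sketch (the dictionary below simply copies the blood_type counts from the table above; the variable names are not part of the report), the percentages can be recomputed like this:

```python
# Counts copied from the blood_type "Common Values" table.
counts = {
    "O+": 3799, "A+": 2968, "B+": 1074, "O-": 702,
    "A-": 578, "AB+": 321, "B-": 160, "AB-": 92,
}

# blood_type has no missing cells, so the counts add up to the number of rows.
total = sum(counts.values())

# Share of each category in percent, rounded to one decimal as in the report.
frequency_pct = {value: round(100 * n / total, 1) for value, n in counts.items()}

for value, pct in frequency_pct.items():
    print(f"{value:<4}{counts[value]:>6}{pct:>7}%")
```

Because the column has no missing values, the total equals the 9694 observations listed in the overview.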

Length

Histogram of lengths of the category

Pie chart

Most frequent words

Value   Count   Frequency (%)
o       4501    46.4%
a       3546    36.6%
b       1234    12.7%
ab      413     4.3%


hemoglobin_level
Real number (ℝ≥0)

HIGH CORRELATION

Distinct                           70
Distinct (%)                       0.7%
Missing                            0
Missing (%)                        0.0%
Infinite                           0
Infinite (%)                       0.0%
Mean                               14.18680627
Minimum                            10.9
Maximum                            18
Zeros                              0
Zeros (%)                          0.0%
Negative                           0
Negative (%)                       0.0%
Memory size                        75.9 KiB

Quantile statistics

Minimum                            10.9
5-th percentile                    12.6
Q1                                 13.4
Median                             14.1
Q3                                 14.9
95-th percentile                   16
Maximum                            18
Range                              7.1
Interquartile range (IQR)          1.5

Descriptive statistics

Standard deviation                 1.050907971
Coefficient of variation (CV)      0.07407643069
Kurtosis                           -0.4462990845
Mean                               14.18680627
Median Absolute Deviation (MAD)    0.8
Skewness                           0.189357108
Sum                                137526.9
Variance                           1.104407564
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values

Value              Count   Frequency (%)
13.6               354     3.7%
13.8               348     3.6%
13.5               337     3.5%
14.5               326     3.4%
13.7               323     3.3%
13.4               318     3.3%
13.9               316     3.3%
14                 315     3.2%
14.4               313     3.2%
14.3               313     3.2%
Other values (60)  6431    66.3%

Smallest values

Value              Count   Frequency (%)
10.9               1       < 0.1%
11.1               1       < 0.1%
11.2               2       < 0.1%
11.3               3       < 0.1%
11.4               2       < 0.1%
11.5               2       < 0.1%
11.6               7       0.1%
11.7               11      0.1%
11.8               13      0.1%
11.9               22      0.2%

Largest values

Value              Count   Frequency (%)
18                 1       < 0.1%
17.8               1       < 0.1%
17.7               2       < 0.1%
17.6               1       < 0.1%
17.5               2       < 0.1%
17.4               3       < 0.1%
17.3               1       < 0.1%
17.2               1       < 0.1%
17.1               8       0.1%
17                 8       0.1%

Distinct        2
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     9.6 KiB

Value   Count   Frequency (%)
False   8522    87.9%
True    1172    12.1%

max_glu_serum
Categorical

MISSING

Distinct        3
Distinct (%)    0.6%
Missing         9186
Missing (%)     94.8%
Memory size     75.9 KiB

Value    Count
85.0     260
200.0    147
300.0    101

Length

Max length       5
Median length    4
Mean length      4.488188976
Min length       4

Characters and Unicode

Total characters       0
Distinct characters    0
Distinct categories    0
Distinct scripts       0
Distinct blocks        0

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    85.0
2nd row    85.0
3rd row    200.0
4th row    300.0
5th row    200.0

Common Values

Value       Count   Frequency (%)
85.0        260     2.7%
200.0       147     1.5%
300.0       101     1.0%
(Missing)   9186    94.8%

Length

Histogram of lengths of the category

Pie chart

Frequencies among non-missing values

Value       Count   Frequency (%)
85.0        260     51.2%
200.0       147     28.9%
300.0       101     19.9%


A1Cresult
Categorical

MISSING

Distinct        3
Distinct (%)    0.2%
Missing         8070
Missing (%)     83.2%
Memory size     75.9 KiB

Value    Count
8.0      798
5.0      478
7.0      348

Length

Max length       3
Median length    3
Mean length      3
Min length       3

Characters and Unicode

Total characters       0
Distinct characters    0
Distinct categories    0
Distinct scripts       0
Distinct blocks        0

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    8.0
2nd row    7.0
3rd row    5.0
4th row    7.0
5th row    8.0

Common Values

Value       Count   Frequency (%)
8.0         798     8.2%
5.0         478     4.9%
7.0         348     3.6%
(Missing)   8070    83.2%

Length

Histogram of lengths of the category

Pie chart

Frequencies among non-missing values

Value       Count   Frequency (%)
8.0         798     49.1%
5.0         478     29.4%
7.0         348     21.4%


diuretics
Boolean

Distinct        2
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     9.6 KiB

Value   Count   Frequency (%)
False   9515    98.2%
True    179     1.8%

insulin
Boolean

HIGH CORRELATION

Distinct        2
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     9.6 KiB

Value   Count   Frequency (%)
True    5257    54.2%
False   4437    45.8%

change
Boolean

HIGH CORRELATION

Distinct        2
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     9.6 KiB

Value   Count   Frequency (%)
False   5165    53.3%
True    4529    46.7%

diabetesMed
Boolean

HIGH CORRELATION

Distinct        2
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     9.6 KiB

Value   Count   Frequency (%)
True    7493    77.3%
False   2201    22.7%

readmitted
Boolean

MISSING

Distinct        2
Distinct (%)    < 0.1%
Missing         200
Missing (%)     2.1%
Memory size     75.9 KiB

Value       Count   Frequency (%)
False       8420    86.9%
True        1074    11.1%
(Missing)   200     2.1%

Interactions

[Figures omitted: 196 interaction plots]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
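A minimal sketch of that recipe in plain Python follows. The helper names (average_ranks, pearson, spearman_rho) and the toy data are illustrative assumptions, not the code the report generator runs; ties receive the average of the ranks they span, which is one common convention.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)


def spearman_rho(x, y):
    """Spearman's rho is Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
r = pearson(x, y)
print(f"rho={rho:.4f} r={r:.4f}")
```

On this toy data ρ is 1 (up to floating-point rounding) because the ordering is perfectly preserved, while Pearson's r stays below 1 because the relationship is not linear.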

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
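
The following minimal sketch applies that formula directly and checks the location and scale invariance on a small made-up sample. The function name, the data, and the particular shift and scale constants are illustrative assumptions only.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of products and squares are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Made-up, roughly linear data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r_original = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
# Negating one variable flips the sign of r but not its magnitude.
r_flipped = pearson_r(x, [-b for b in y])
```

Here `r_original` and `r_rescaled` agree up to floating-point error, and `r_flipped` has the same magnitude with the opposite sign.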

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
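
The pair-counting definition can be sketched as follows. This is the plain form, often called tau-a, which simply ignores tied pairs; library implementations typically apply a tie correction, so their results can differ when the data contain ties. The function name and the toy inputs are assumptions made for the example.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:      # both variables move in the same direction
            concordant += 1
        elif product < 0:    # the variables move in opposite directions
            discordant += 1
        # product == 0 means a tie in x or y; tau-a counts it as neither
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs

# Toy data: one swapped pair out of six possible pairs.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

With four observations there are six pairs; five are concordant and one is discordant, so τ = (5 - 1) / 6.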

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
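
As a hedged sketch of what a bias-corrected Cramér's V looks like, the code below follows the commonly cited form of Bergsma's correction: the squared φ statistic and the table dimensions are each adjusted by a term that shrinks as the sample size grows. It is not taken from the report's own source code; the function name and the example tables are made up, and the sketch assumes every row and column total is non-zero.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Corrected quantities: subtract the expected bias, never going below zero.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Made-up 2x2 tables: one with independent margins, one perfectly associated.
v_independent = cramers_v_corrected([[10, 20], [20, 40]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
```

The independent table gives a value of 0 and the perfectly associated table gives a value of 1 up to floating-point error, matching the range described above.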

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
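
One way to read the nullity correlation shown in the heatmap is as an ordinary correlation between present/missing indicators. The sketch below is written under that assumption: each column becomes a 0/1 indicator (1 where a value is present) and the two indicators are correlated. The function name and the three toy columns are hypothetical and are not values from this dataset.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'value is present' indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    sd_a = sqrt(sum((u - mean_a) ** 2 for u in a))
    sd_b = sqrt(sum((v - mean_b) ** 2 for v in b))
    # Undefined (division by zero) if a column is fully present or fully missing.
    return cov / (sd_a * sd_b)

# Hypothetical columns; None marks a missing cell.
col_x = [None, None, 70.0, None, 82.5, None]
col_y = [None, None, 85.0, None, 90.0, None]   # missing in the same rows as col_x
col_z = [1.0, 2.0, None, 3.0, None, 4.0]       # present exactly where col_x is missing

same_pattern = nullity_correlation(col_x, col_y)      # close to +1
opposite_pattern = nullity_correlation(col_x, col_z)  # close to -1
```

A value near +1 means the two columns tend to be present or missing together, and a value near -1 means one tends to be present when the other is missing.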

Sample

First rows

(index) | Unnamed: 0 | admission_id | patient_id | race | gender | age | weight | admission_type_code | discharge_disposition_code | admission_source_code | time_in_hospital | payer_code | medical_specialty | has_prosthesis | complete_vaccination_status | num_lab_procedures | num_procedures | num_medications | number_outpatient | number_emergency | number_inpatient | diag_1 | diag_2 | diag_3 | number_diagnoses | blood_type | hemoglobin_level | blood_transfusion | max_glu_serum | A1Cresult | diuretics | insulin | change | diabetesMed | readmitted
0 | 0 | 100706.0 | 12224646.0 | white | male | 40.0 | NaN | Urgent | Discharged to home | Physician Referral | 3 | BC | Endocrinology | False | complete | 64.0 | 0 | 10.0 | 0 | 0 | 0 | diseases of the digestive system | endocrine, nutritional and metabolic diseases, and immunity disorders | diseases of the circulatory system | 5 | A+ | 14.2 | False | NaN | 8.0 | False | True | False | True | False
1 | 1 | 99518.0 | 1020438.0 | black | male | 60.0 | NaN | Urgent | unknown | Clinic Referral | 2 | unknown | Others | False | complete | 32.0 | 1 | 15.0 | 4 | 0 | 4 | diseases of the circulatory system | diseases of the genitourinary system | diseases of the respiratory system | 9 | O- | 14.1 | False | NaN | NaN | False | True | False | True | False
2 | 2 | 89397.0 | 141934014.0 | black | female | 60.0 | NaN | Elective | Discharged to home | Physician Referral | 5 | BC | Family/GeneralPractice | False | complete | 57.0 | 2 | 30.0 | 0 | 0 | 0 | diseases of the digestive system | diseases of the genitourinary system | diseases of the genitourinary system | 9 | A+ | 13.4 | False | NaN | NaN | False | True | False | True | False
3 | 3 | 89653.0 | 168821964.0 | white | male | 50.0 | NaN | Elective | Discharged to home | Physician Referral | 1 | MC | Radiology | False | complete | 34.0 | 6 | 24.0 | 0 | 0 | 1 | diseases of the circulatory system | diseases of the circulatory system | mental disorders | 9 | O- | 15.3 | False | NaN | NaN | True | True | True | True | False
4 | 4 | 83278.0 | 6485868.0 | hispanic | male | 30.0 | NaN | Emergency | Discharged/transferred to SNF | Emergency Room | 14 | unknown | Others | False | complete | 89.0 | 6 | 48.0 | 0 | 0 | 0 | injury and poisoning | injury and poisoning | injury and poisoning | 9 | O+ | 14.6 | True | NaN | NaN | False | True | True | True | False
5 | 5 | 83713.0 | 123423750.0 | white | male | 40.0 | NaN | Emergency | Discharged/transferred to home with home health service | Emergency Room | 7 | unknown | Others | False | complete | 3.0 | 0 | 11.0 | 0 | 0 | 1 | diseases of the circulatory system | diseases of the circulatory system | diseases of the circulatory system | 9 | A+ | 15.4 | False | NaN | NaN | False | True | True | True | False
6 | 6 | 101072.0 | 114696756.0 | white | male | 60.0 | NaN | Emergency | Discharged to home | Emergency Room | 2 | MC | Others | False | complete | 44.0 | 0 | 17.0 | 0 | 0 | 0 | injury and poisoning | unknown | diseases of the circulatory system | 9 | A+ | 14.8 | False | NaN | NaN | False | True | True | True | True
7 | 7 | 90645.0 | 11351394.0 | white | female | 60.0 | NaN | Elective | Discharged/transferred to SNF | Physician Referral | 3 | unknown | Orthopedics | False | complete | 26.0 | 1 | 32.0 | 0 | 0 | 1 | diseases of the musculoskeletal system and connective tissue | endocrine, nutritional and metabolic diseases, and immunity disorders | diseases of the circulatory system | 9 | A+ | 14.2 | False | NaN | NaN | False | True | False | True | False
8 | 8 | 97867.0 | 141851502.0 | white | male | 80.0 | NaN | unknown | Discharged/transferred to SNF | Emergency Room | 4 | unknown | InternalMedicine | False | complete | 41.0 | 0 | 8.0 | 0 | 0 | 0 | injury and poisoning | injury and poisoning | external causes of injury | 6 | O+ | 18.0 | False | 85.0 | NaN | False | True | False | True | False
9 | 9 | 89885.0 | 140092992.0 | white | male | 70.0 | NaN | Emergency | Discharged to home | Emergency Room | 3 | unknown | Others | False | complete | 44.0 | 0 | 13.0 | 0 | 0 | 0 | diseases of the circulatory system | endocrine, nutritional and metabolic diseases, and immunity disorders | diseases of the circulatory system | 9 | O+ | 15.6 | False | NaN | 7.0 | False | True | True | True | False

Last rows

(index) | Unnamed: 0 | admission_id | patient_id | race | gender | age | weight | admission_type_code | discharge_disposition_code | admission_source_code | time_in_hospital | payer_code | medical_specialty | has_prosthesis | complete_vaccination_status | num_lab_procedures | num_procedures | num_medications | number_outpatient | number_emergency | number_inpatient | diag_1 | diag_2 | diag_3 | number_diagnoses | blood_type | hemoglobin_level | blood_transfusion | max_glu_serum | A1Cresult | diuretics | insulin | change | diabetesMed | readmitted
9684 | 9684 | 81825.0 | 165320082.0 | white | female | 70.0 | NaN | Emergency | Discharged to home | Emergency Room | 2 | MC | InternalMedicine | False | complete | 60.0 | 0 | 11.0 | 0 | 0 | 0 | endocrine, nutritional and metabolic diseases, and immunity disorders | diseases of the respiratory system | diseases of the genitourinary system | 9 | O+ | 14.0 | False | NaN | NaN | False | False | False | True | False
9685 | 9685 | 96103.0 | 84284514.0 | white | male | 70.0 | NaN | Emergency | Discharged/transferred to SNF | Emergency Room | 3 | MC | Others | False | complete | 53.0 | 0 | 18.0 | 0 | 0 | 1 | diseases of the circulatory system | endocrine, nutritional and metabolic diseases, and immunity disorders | endocrine, nutritional and metabolic diseases, and immunity disorders | 9 | O+ | 17.4 | False | NaN | 8.0 | False | True | True | True | False
9686 | 9686 | 96970.0 | 87469974.0 | white | female | 60.0 | NaN | Elective | Discharged to home | Physician Referral | 7 | MC | Others | False | complete | 21.0 | 4 | 11.0 | 0 | 0 | 0 | neoplasms | diseases of the circulatory system | endocrine, nutritional and metabolic diseases, and immunity disorders | 4 | O+ | 13.0 | False | NaN | NaN | False | True | True | True | False
9687 | 9687 | 99920.0 | 82030878.0 | white | male | 60.0 | NaN | Emergency | Discharged to home | Emergency Room | 2 | SP | Others | False | complete | 37.0 | 1 | 8.0 | 0 | 0 | 1 | diseases of the digestive system | diseases of the blood and blood-forming organs | infectious and parasitic diseases | 7 | A- | 15.1 | False | NaN | NaN | False | False | False | False | False
9688 | 9688 | 88037.0 | 12464442.0 | unknown | male | 70.0 | NaN | Urgent | Discharged to home | unknown | 2 | MC | InternalMedicine | False | complete | 41.0 | 0 | 15.0 | 0 | 0 | 0 | diseases of the circulatory system | endocrine, nutritional and metabolic diseases, and immunity disorders | diseases of the circulatory system | 5 | A+ | 14.0 | False | NaN | NaN | False | True | True | True | False
9689 | 9689 | 91851.0 | 17594028.0 | white | female | 70.0 | NaN | Emergency | unknown | Emergency Room | 3 | unknown | Family/GeneralPractice | False | incomplete | 52.0 | 0 | 7.0 | 0 | 0 | 0 | diseases of the digestive system | diseases of the circulatory system | external causes of injury | 9 | A+ | 13.0 | False | NaN | NaN | False | False | False | False | False
9690 | 9690 | 84067.0 | 46722582.0 | white | male | 80.0 | NaN | unknown | unknown | unknown | 7 | unknown | Family/GeneralPractice | False | complete | 25.0 | 0 | 15.0 | 2 | 0 | 2 | injury and poisoning | diseases of the circulatory system | diseases of the respiratory system | 9 | B+ | 15.2 | False | 85.0 | NaN | False | False | False | True | False
9691 | 9691 | 85961.0 | 83528442.0 | white | male | 70.0 | NaN | unknown | Discharged to home | Physician Referral | 3 | MC | Family/GeneralPractice | False | incomplete | 45.0 | 0 | 7.0 | 0 | 0 | 0 | diseases of the circulatory system | diseases of the circulatory system | diseases of the circulatory system | 5 | O- | 14.9 | False | NaN | NaN | False | False | False | True | False
9692 | 9692 | 89365.0 | 119906946.0 | white | male | 80.0 | NaN | Emergency | Discharged to home | Emergency Room | 1 | MC | Others | False | incomplete | 3.0 | 0 | NaN | 0 | 0 | 0 | diseases of the circulatory system | diseases of the circulatory system | diseases of the circulatory system | 9 | O+ | 14.5 | False | NaN | NaN | False | True | True | True | False
9693 | 9693 | 82858.0 | 103427046.0 | hispanic | female | 60.0 | NaN | Emergency | Discharged to home | Physician Referral | 2 | unknown | InternalMedicine | False | complete | 44.0 | 0 | 8.0 | 0 | 0 | 1 | diseases of the circulatory system | symptoms, signs, and ill-defined conditions | external causes of injury | 6 | B+ | 13.5 | False | NaN | 5.0 | False | False | False | False | True